Fundamental limitations of genomic language models for realistic sequence generation
This study demonstrates that current genomic language models fail to generate realistic synthetic genomes. They capture local sequence statistics but do not preserve essential long-range organization, repetitive elements, or evolutionary constraints, so their synthetic sequences are easily distinguishable from natural ones.
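
The gap between local statistics and long-range structure can be illustrated with a toy sketch. It is not the study's method or data. It assumes a first-order Markov chain as a stand-in for a generator that learns only local statistics, and a made-up "natural" sequence built from random background with one 200-bp element inserted 20 times. All function names and parameters (`kmer_freqs`, `repeated_kmer_fraction`, `sample_markov`, the element length, the copy count) are hypothetical choices made for illustration.

```python
import random
from collections import Counter


def kmer_freqs(seq, k):
    """Relative frequency of each overlapping k-mer in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def l1_distance(p, q):
    """Sum of absolute differences between two frequency tables."""
    keys = set(p) | set(q)
    return sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in keys)


def repeated_kmer_fraction(seq, k):
    """Fraction of k-mer positions whose k-mer occurs more than once in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return sum(n for n in counts.values() if n > 1) / total


def sample_markov(seq, length, rng):
    """Sample from a first-order Markov chain fitted to seq.

    Assumption: a toy stand-in for a model that sees only local context.
    """
    transitions = {}
    for a, b in zip(seq, seq[1:]):
        transitions.setdefault(a, []).append(b)
    out = [seq[0]]
    while len(out) < length:
        nxt = transitions.get(out[-1])
        # Fall back to any base from seq if a symbol has no observed successor.
        out.append(rng.choice(nxt if nxt else seq))
    return "".join(out)


rng = random.Random(0)

# Toy "natural" sequence: random background with one element repeated 20 times.
element = "".join(rng.choice("ACGT") for _ in range(200))
parts = []
for _ in range(20):
    parts.append("".join(rng.choice("ACGT") for _ in range(300)))
    parts.append(element)
natural = "".join(parts)

synthetic = sample_markov(natural, len(natural), rng)

# Local statistic: dinucleotide frequencies of the two sequences.
local_gap = l1_distance(kmer_freqs(natural, 2), kmer_freqs(synthetic, 2))

# Longer-range statistic: how much of each sequence lies in repeated 20-mers.
natural_repeats = repeated_kmer_fraction(natural, 20)
synthetic_repeats = repeated_kmer_fraction(synthetic, 20)

print(f"dinucleotide L1 gap:       {local_gap:.3f}")
print(f"repeated 20-mers, natural:   {natural_repeats:.3f}")
print(f"repeated 20-mers, synthetic: {synthetic_repeats:.3f}")
```

In this toy setting the dinucleotide frequencies of the two sequences stay close, while the repeated-element content of the sampled sequence collapses. A simple repeat-content statistic is therefore enough to tell the two apart here, although real genomes and real genomic language models are far more complex than this sketch.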